Balancing Fine- and Medium-Grained Parallelism in Scheduling Loops for the XIMD Architecture

Authors

  • Chris J. Newburn
  • Andrew S. Huang
  • John Paul Shen
Abstract

This paper presents an approach to scheduling loops that leverages the distinctive architectural features of the XIMD, particularly the variable number of instruction streams and low synchronization cost. The classical VLIW and MIMD architectures have a fixed number of instruction streams, each with a fixed width. A compiler for the XIMD architecture can exploit fine-grained parallelism within each instruction stream and medium-grained parallelism between instruction streams. Its task is to schedule code using the instruction stream widths best suited for the available program parallelism in each loop nest. The combination of instruction stream widths is selected to make the best utilization of the machine resources, and hence to minimize the execution time for the whole schedule. A new loop scheduling technique is presented for the XIMD called iteration mapping. It applies and extends doacross to use more than one processing unit per instruction stream, and to exploit both fine- and medium-grained parallelism in scheduling nested loops. For the benchmarks used, the extracted parallelism using iteration mapping is twice that extracted using only fine-grained compaction within basic blocks for an eight-wide machine. Near-linear speedups are achieved for many non-vectorizable loops.
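
The trade-off the abstract describes, between wider instruction streams (fine-grained parallelism) and more instruction streams (medium-grained parallelism), can be illustrated with a toy doacross timing model. The sketch below is not the paper's iteration mapping algorithm. The function names, the round-robin assignment of iterations to streams, and the cost functions `body_time_at` and `delay_at` are all assumptions made for illustration.

```python
from math import ceil


def doacross_time(n_iters, body_time, delay, n_streams):
    """Finish time (in cycles) of a doacross loop under a toy model.

    Assumptions (hypothetical, not taken from the paper):
    - iteration i may start no earlier than `delay` cycles after
      iteration i-1 started (a loop-carried dependence);
    - iterations are assigned round-robin to `n_streams` streams, so
      iteration i must also wait for iteration i - n_streams to finish.
    """
    starts, finishes = [], []
    for i in range(n_iters):
        start = 0
        if i >= 1:
            start = max(start, starts[i - 1] + delay)
        if i >= n_streams:
            start = max(start, finishes[i - n_streams])
        starts.append(start)
        finishes.append(start + body_time)
    return finishes[-1] if finishes else 0


def best_width(n_iters, machine_width, body_time_at, delay_at):
    """Try every stream width that divides the machine evenly and
    return (width, total_time) for the fastest configuration."""
    best = None
    for width in range(1, machine_width + 1):
        if machine_width % width:
            continue  # keep all streams the same width in this sketch
        n_streams = machine_width // width
        total = doacross_time(
            n_iters, body_time_at(width), delay_at(width), n_streams
        )
        if best is None or total < best[1]:
            best = (width, total)
    return best


if __name__ == "__main__":
    # Hypothetical loop body: 16 operations, critical path of 4 cycles,
    # and a fixed 2-cycle delay between successive iteration starts.
    body = lambda w: max(ceil(16 / w), 4)
    delay = lambda w: 2
    print(best_width(8, 8, body, delay))
```

In this model a single eight-wide stream is limited by the critical path of the loop body, while eight one-wide streams are limited by the long per-iteration body time. An intermediate width balances the two, which is the kind of selection the abstract attributes to the compiler.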

Related resources

Iteration Mapping: Loop Software Pipelining on an XIMD

The multiple instruction streams, low synchronization cost and synchronous nature of the XIMD (variable instruction stream, multiple data stream) architecture create an opportunity for a new architecture-compiler interface. As an extension to the VLIW (Very Long Instruction Word) architecture, the XIMD can exploit all VLIW scheduling techniques but these do not take full advantage of the unique...

Implementing Fine/Medium Grained TLP Support in a Many-Core Architecture

We believe that future many-core architectures should support a simple and scalable way to execute many threads that are generated by parallel programs. A good candidate to implement an efficient and scalable execution of threads is the DTA (Decoupled Threaded Architecture), which is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a hardware scheduling unit and r...

Scheduling DAG's for Asynchronous Multiprocessor Execution

A new approach is given for scheduling a sequential instruction stream for execution “in parallel” on asynchronous multiprocessors. The key idea in our approach is to exploit the fine grained parallelism present in the instruction stream. In this context, schedules are constructed by a careful balancing of execution and communication costs at the level of individual instructions, and their data...

Efficient Exploitation of Parallelism on Pentium III and Pentium 4 Processor-Based Systems

Systems based on the Pentium III and Pentium 4 processors enable the exploitation of parallelism at a fine- and medium-grained level. Dual- and quad-processor systems, for example, enable the exploitation of medium-grained parallelism by using multithreaded code that takes advantage of multiple control and arithmetic logic units. Streaming Single-Instruction-Multiple-Data (SIMD) extensions, on the o...

Compiling Regular Computations to Fine-Grained Linear Processor Arrays

Fine-grained linear processor arrays are an important class of architectures for obtaining high performance on computationally intensive algorithms with large data sets, as found prevalently in digital signal processing and scientific computing. The vast number of processing elements on these architectures provides an immense amount of potential parallelism but at the price of limited interconnec...

Publication date: 1993